Predictive Representativity: Uncovering Racial Bias in AI-based Skin Cancer Detection

Morales-Forero, Andrés, Rueda, Lili J., Herrera, Ronald, Bassetto, Samuel, Coatanea, Eric

arXiv.org Machine Learning

Artificial intelligence (AI) systems increasingly inform medical decision-making, yet concerns about algorithmic bias and inequitable outcomes persist, particularly for historically marginalized populations. This paper introduces the concept of Predictive Representativity (PR), a fairness-auditing framework that shifts the focus from dataset composition to outcome-level equity. Through a case study in dermatology, we evaluated AI-based skin cancer classifiers trained on the widely used HAM10000 dataset and on an independent clinical dataset (BOSQUE Test set) from Colombia. Our analysis reveals substantial performance disparities by skin phototype, with classifiers consistently underperforming for individuals with darker skin, despite proportional sampling in the source data. We argue that representativity must be understood not as a static feature of datasets but as a dynamic, context-sensitive property of model predictions. PR operationalizes this shift by quantifying how reliably models generalize fairness across subpopulations and deployment contexts. We further propose an External Transportability Criterion that formalizes the thresholds for fairness generalization. Our findings highlight the ethical imperative for post-hoc fairness auditing, transparency in dataset documentation, and inclusive model validation pipelines. This work offers a scalable tool for diagnosing structural inequities in AI systems, contributing to discussions on equity, interpretability, and data justice and fostering a critical re-evaluation of fairness in data-driven healthcare.
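The abstract does not give PR's exact formula, but the outcome-level idea it describes can be sketched as an audit that compares each subgroup's performance to the best-performing subgroup and flags groups falling below a threshold. Everything below (the function name, the 0.9 threshold, and the accuracy figures) is hypothetical illustration, not the authors' method:

```python
# Illustrative sketch of an outcome-level fairness audit in the spirit of
# Predictive Representativity (hypothetical formula and numbers).

def predictive_representativity(acc_by_group, threshold=0.9):
    """Ratio of each subgroup's accuracy to the best subgroup's accuracy;
    groups whose ratio falls below `threshold` fail the audit."""
    best = max(acc_by_group.values())
    ratios = {g: a / best for g, a in acc_by_group.items()}
    failing = sorted(g for g, r in ratios.items() if r < threshold)
    return ratios, failing

# Hypothetical per-phototype accuracies for a skin cancer classifier.
acc = {"phototype I-II": 0.91, "phototype III-IV": 0.88, "phototype V-VI": 0.74}
ratios, failing = predictive_representativity(acc)
print(failing)  # the darkest-phototype group fails the 0.9 threshold
```

A threshold of this kind plays the role the abstract assigns to the External Transportability Criterion: a quantitative bar a model must clear in every subgroup before its fairness is considered to generalize.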


TrustSkin: A Fairness Pipeline for Trustworthy Facial Affect Analysis Across Skin Tone

Cabanas, Ana M., Pedro, Alma, Mery, Domingo

arXiv.org Artificial Intelligence

Understanding how facial affect analysis (FAA) systems perform across different demographic groups requires reliable measurement of sensitive attributes such as ancestry, often approximated by skin tone, which itself is highly influenced by lighting conditions. Using AffectNet and a MobileNet-based model, we assess fairness across skin tone groups defined by each method. Results reveal a severe underrepresentation of dark skin tones (2%), alongside fairness disparities in F1-score (up to 0.08) and TPR (up to 0.11) across groups. Grad-CAM analysis further highlights differences in model attention patterns by skin tone, suggesting variation in feature encoding. To support future mitigation efforts, we also propose a modular fairness-aware pipeline that integrates perceptual skin tone estimation, model interpretability, and fairness evaluation. These findings emphasize the relevance of skin tone measurement choices in fairness assessment and suggest that ITA-based evaluations may overlook disparities affecting darker-skinned individuals.

I. INTRODUCTION

Predictive algorithms and biometric systems are increasingly used in critical areas such as healthcare, security, and human-computer interaction [1]. However, these systems remain prone to bias arising from demographic imbalances in training data and algorithmic design flaws [1]-[3]. In computer vision applications like EmotionAI and Facial Affect Analysis (FAA), such biases often result in consistent performance disparities across attributes like age, sex, and skin tone [4]-[6]. Given the sensitive deployment of FAA in psychological evaluation, driver monitoring, and educational feedback [1], [7], [8], ensuring fairness, transparency, and robustness across demographic groups is essential.
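The F1 and TPR disparities the abstract reports are maximum gaps between per-group scores. A minimal sketch of that computation, on hypothetical labels and group assignments (the data and variable names below are invented for illustration, not taken from the paper):

```python
# Sketch: per-group F1/TPR and the max pairwise gap across skin-tone groups.

def rates(y_true, y_pred):
    """F1 and true-positive rate for one group of binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    prec = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = 2 * prec * tpr / (prec + tpr) if (prec + tpr) else 0.0
    return f1, tpr

def fairness_gaps(y_true, y_pred, groups):
    """Max difference in F1 and TPR across demographic groups."""
    per_group = {}
    for g in set(groups):
        idx = [i for i, gg in enumerate(groups) if gg == g]
        per_group[g] = rates([y_true[i] for i in idx], [y_pred[i] for i in idx])
    f1s = [v[0] for v in per_group.values()]
    tprs = [v[1] for v in per_group.values()]
    return max(f1s) - min(f1s), max(tprs) - min(tprs)

# Hypothetical predictions for two skin-tone groups.
y_true = [1, 0, 1, 1, 0, 1, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]
groups = ["light", "light", "light", "light", "dark", "dark", "dark", "dark"]
f1_gap, tpr_gap = fairness_gaps(y_true, y_pred, groups)
print(round(f1_gap, 3), round(tpr_gap, 3))
```

Reporting the gap rather than only aggregate scores is what surfaces disparities like the 0.08 (F1) and 0.11 (TPR) figures cited above.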


Bayesian generative models can flag performance loss, bias, and out-of-distribution image content

López-Pérez, Miguel, Miani, Marco, Naranjo, Valery, Hauberg, Søren, Feragen, Aasa

arXiv.org Machine Learning

Generative models are popular for medical imaging tasks such as anomaly detection, feature extraction, data visualization, or image generation. Since they are parameterized by deep learning models, they are often sensitive to distribution shifts and unreliable when applied to out-of-distribution data, creating a risk of, e.g., underrepresentation bias. This behavior can be flagged using uncertainty quantification (UQ) methods for generative models, but their availability remains limited. We propose SLUG: a new UQ method for VAEs that combines recent advances in Laplace approximations with stochastic trace estimators to scale gracefully with image dimensionality. We show that our UQ score -- unlike the VAE's encoder variances -- correlates strongly with reconstruction error and racial underrepresentation bias for dermatological images. We also show how pixel-wise uncertainty can detect out-of-distribution image content such as ink, rulers, and patches, which is known to induce learning shortcuts in predictive models.
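The scaling trick named in the abstract, stochastic trace estimation, can be illustrated with Hutchinson's estimator: tr(A) is approximated by averaging v^T A v over random Rademacher vectors v, so only matrix-vector products are needed and the (potentially huge) curvature matrix never has to be formed explicitly. This is a generic sketch of the estimator, not the authors' SLUG implementation:

```python
import numpy as np

# Hutchinson's stochastic trace estimator: tr(A) ~= mean_i v_i^T A v_i
# for random Rademacher probe vectors v_i. Only matvecs with A are needed,
# which is what lets Laplace-style UQ scale with image dimensionality.

def hutchinson_trace(matvec, dim, n_samples=1000, rng=None):
    rng = rng or np.random.default_rng(0)
    total = 0.0
    for _ in range(n_samples):
        v = rng.choice([-1.0, 1.0], size=dim)  # Rademacher probe
        total += v @ matvec(v)
    return total / n_samples

# Demo on a diagonal matrix with known trace 1 + 2 + ... + 100 = 5050.
# (For diagonal A, Rademacher probes give v^T A v = tr(A) exactly, since
# each v_i^2 = 1, so the estimate here is deterministic.)
A = np.diag(np.arange(1.0, 101.0))
est = hutchinson_trace(lambda v: A @ v, dim=100, n_samples=200)
print(est)  # ~ 5050
```

For general symmetric matrices the estimate has nonzero variance and more probe vectors tighten it; in practice `matvec` would be a Hessian- or GGN-vector product computed by automatic differentiation.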


Is thermography a viable solution for detecting pressure injuries in dark skin patients?

Asare-Baiden, Miriam, Jordan, Kathleen, Chung, Andrew, Sonenblum, Sharon Eve, Ho, Joyce C.

arXiv.org Artificial Intelligence

Pressure injury (PI) detection is challenging, especially in dark skin tones, due to the unreliability of visual inspection. Thermography has been suggested as a viable alternative, as temperature differences in the skin can indicate impending tissue damage. Although deep learning models have demonstrated considerable promise toward reliably detecting PI, existing work fails to evaluate performance on darker skin tones and varying data collection protocols. In this paper, we introduce a new thermal and optical imaging dataset of 35 participants focused on darker skin tones, where temperature differences are induced through cooling and cupping protocols. We vary the image collection process to include different cameras, lighting, patient pose, and camera distance. We compare the performance of a small convolutional neural network (CNN) trained on either the thermal or the optical images on all skin tones. Our preliminary results suggest that the thermography-based CNN is robust to data collection protocols for all skin tones.


Camera-Based Remote Physiology Sensing for Hundreds of Subjects Across Skin Tones

Tang, Jiankai, Li, Xinyi, Liu, Jiacheng, Zhang, Xiyuxing, Wang, Zeyu, Wang, Yuntao

arXiv.org Artificial Intelligence

Remote photoplethysmography (rPPG) emerges as a promising method for non-invasive, convenient measurement of vital signs, utilizing the widespread presence of cameras. Despite advancements, existing datasets fall short in terms of size and diversity, limiting comprehensive evaluation under diverse conditions. This paper presents an in-depth analysis of the VitalVideo dataset, the largest real-world rPPG dataset to date, encompassing 893 subjects and 6 Fitzpatrick skin tones. Our experimentation with six unsupervised methods and three supervised models demonstrates that datasets comprising a few hundred subjects (i.e., 300 for UBFC-rPPG, 500 for PURE, and 700 for MMPD-Simple) are sufficient for effective rPPG model training. Our findings highlight the importance of diversity and consistency in skin tones for precise performance evaluation across different datasets.
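The simplest of the unsupervised rPPG methods the abstract alludes to works roughly like this: the mean green-channel intensity of facial skin pixels varies slightly with blood volume, so the dominant spectral peak of that trace within the physiological band gives the heart rate. The sketch below runs on a synthetic trace (the signal, sampling rate, and band limits are illustrative assumptions, not from the paper):

```python
import numpy as np

# GREEN-channel-style rPPG sketch on a synthetic 20 s "video" trace:
# a 1.2 Hz pulse (72 bpm) buried in noise, sampled at a 30 fps camera rate.
fs = 30.0
t = np.arange(0, 20, 1 / fs)
pulse_hz = 1.2
rng = np.random.default_rng(0)
green = 0.5 * np.sin(2 * np.pi * pulse_hz * t) + 0.2 * rng.standard_normal(t.size)

def estimate_hr_bpm(signal, fs, lo=0.7, hi=3.0):
    """Heart rate from the dominant spectral peak in [lo, hi] Hz
    (42-180 bpm, a common physiological band)."""
    spec = np.abs(np.fft.rfft(signal - signal.mean()))
    freqs = np.fft.rfftfreq(signal.size, 1 / fs)
    band = (freqs >= lo) & (freqs <= hi)
    return 60.0 * freqs[band][np.argmax(spec[band])]

print(estimate_hr_bpm(green, fs))  # close to 72 bpm
```

Skin-tone disparities enter precisely at the first step: the pulsatile green-channel signal is weaker relative to noise for darker skin under typical lighting, which is why diverse Fitzpatrick coverage matters for evaluation.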


Deepfake detection tools must work with dark skin tones, experts warn

The Guardian > Technology

Detection tools being developed to combat the growing threat of deepfakes – realistic-looking false content – must use training datasets that are inclusive of darker skin tones to avoid bias, experts have warned. Most deepfake detectors are based on a learning strategy that depends largely on the dataset that is used for its training. It then uses AI to detect signs that may not be clear to the human eye. This can include monitoring blood flow and heart rate. However, these detection methods do not always work on people with darker skin tones, and if training sets do not contain all ethnicities, accents, genders, ages and skin tones, they are open to bias, experts warned.


How AI makes images based on a few words

#artificialintelligence

Humans have designed a range of tools to aid us in the creation of art, and they've evolved dramatically over time. The creative person's tool kit has recently grown with the addition of a formidable new tool: text-to-image generators powered by artificial intelligence. The possibilities of what this novel technology can be used to create are in many ways endless, but that wide range of potential comes at a cost. While some images -- whether cartoon-like doodles or highly realistic scenes that resemble real photographs -- may be creative or inspiring, others could in some cases be harmful or dangerous. When a user enters a handful of key words, these models generate images that combine those concepts in novel ways.


Google Built a Camera to Better Portray People With Darker Skin Tones. Does It?

WSJ.com: WSJD - Technology

Google's artificial intelligence and machine-learning algorithms have in the past been criticized for how they deal with darker skin tones, including mistakenly tagging photos of Black people as gorillas. The company apologized and said it would fix its software. Now, it's using AI to power what it calls "the world's most inclusive camera." The goal of the Real Tone image processing in Google's new Pixel 6 and Pixel 6 Pro smartphones is to "more accurately highlight the nuances of diverse skin tones," especially darker complexions, according to the company's website. The new phones, which begin shipping Thursday, are the first to come with Real Tone.